Blind Estimation of the Speech Transmission Index for Speech Quality Prediction

Authors

  • Prem Seetharaman
  • Gautham J. Mysore
  • Paris Smaragdis
  • Bryan Pardo
Abstract

The speech transmission index (STI) of a listening position within a given room indicates the quality and intelligibility of speech uttered in that room. The measure is very reliable for predicting speech intelligibility in many room conditions, but computing it requires a measurement of the room's impulse response. We present a method that uses deep convolutional neural networks to estimate the STI blindly, without measuring or modeling the impulse response of the room. Our model is trained entirely on simulated room impulse responses combined with clean speech examples from the DAPS dataset [1], and it works directly on PCM audio. Our experiments show that our method predicts the true STI with a high degree of accuracy: an average error of under 4%. It can also distinguish between different STI conditions at a level of granularity comparable to that of humans.
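As a rough illustration of how a training pair of this kind could be assembled, the sketch below convolves a clean signal with a toy impulse response and labels it with a simplified STI. It is not the authors' pipeline. The exponentially decaying noise model for the impulse response, the full-band STI approximation (no octave-band filtering or band weighting, and no background noise), and the names `synthetic_rir`, `simplified_sti`, and `make_example` are all assumptions made for this example.

```python
import numpy as np

# The 14 modulation frequencies (Hz) conventionally used for STI.
MOD_FREQS = np.array([0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                      3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5])


def synthetic_rir(rt60, sr=16000, seed=0):
    """Toy impulse response (assumption): white noise whose amplitude
    decays exponentially and reaches -60 dB at t = rt60 seconds."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(rt60 * sr)) / sr
    decay = np.exp(-3.0 * np.log(10.0) * t / rt60)  # 10**-3 at t = rt60
    return rng.standard_normal(t.size) * decay


def simplified_sti(rir, sr=16000):
    """Full-band, noise-free STI approximation (assumption: the octave-band
    analysis and band weights of the full procedure are omitted)."""
    energy = rir ** 2
    t = np.arange(rir.size) / sr
    # Modulation transfer function from the squared impulse response.
    kernel = np.exp(-2j * np.pi * MOD_FREQS[:, None] * t[None, :])
    mtf = np.abs(kernel @ energy) / energy.sum()
    mtf = np.clip(mtf, 1e-6, 1.0 - 1e-6)  # avoid log of 0 or division by 0
    # Apparent SNR in dB, limited to +/-15 dB, mapped to [0, 1] and averaged.
    snr = np.clip(10.0 * np.log10(mtf / (1.0 - mtf)), -15.0, 15.0)
    return float(np.mean((snr + 15.0) / 30.0))


def make_example(clean, rt60, sr=16000):
    """Return (reverberant audio, simplified STI label) for one clean clip."""
    rir = synthetic_rir(rt60, sr)
    reverberant = np.convolve(clean, rir)[:clean.size]
    return reverberant, simplified_sti(rir, sr)
```

In this sketch a longer reverberation time yields a lower label, so `simplified_sti(synthetic_rir(2.0))` comes out below `simplified_sti(synthetic_rir(0.3))`. A network trained on such pairs would see only the reverberant audio and would have to predict the label without access to the impulse response.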

Similar resources

Acoustic Study of an Auditorium by the Determination of Reverberation Time and Speech Transmission Index

The quality of the communication between teachers and students and ultimately, of classroom education itself, is closely linked to the acoustic quality of the auditorium. This acoustic quality can be characterized based on the reverberation time (RT), speech transmission index (STI) and the sound insulation. In this context, an acoustic study was conducted in an auditorium located in the Higher...

Using FFI Interpolator and VQ Quantization for Designing of High Quality 1200 BPS Speech Vocoder

Storing or transmitting speech signals at very low bit rates is an active area of speech processing. We used stochastic inter-frame interpolators and vector quantization (VQ) as a new method for developing a high-quality 1200 BPS speech vocoder. Objective and subjective test results show that the performance of the new vocoder is comparable to that of standard 4800 BPS vocoders (such as CELP).

Joint Source-Channel CELP Coding

The CELP speech coding method is used extensively in voice communication, multimedia, video conferencing, and other systems. Many papers address improving the characteristics of CELP coding to obtain better speech quality after decoding. Most of them aim to lower the rate of CELP-coded speech transmission. One problem related to low rate CELP speech transmission w...

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

Publication date: 2018